Variable selection methods for model-based clustering
نویسندگان
چکیده
منابع مشابه
Variable Selection for Model-Based Clustering
We consider the problem of variable or feature selection for model-based clustering. We recast the problem of comparing two nested subsets of variables as a model comparison problem, and address it using approximate Bayes factors. We develop a greedy search algorithm for finding a local optimum in model space. The resulting method selects variables (or features), the number of clusters, and the...
متن کاملPairwise variable selection for high-dimensional model-based clustering.
Variable selection for clustering is an important and challenging problem in high-dimensional data analysis. Existing variable selection methods for model-based clustering select informative variables in a "one-in-all-out" manner; that is, a variable is selected if at least one pair of clusters is separable by this variable and removed if it cannot separate any of the clusters. In many applicat...
متن کاملVariable selection in model-based clustering using multilocus genotype data
We propose a variable selection procedure in model-based clustering multilocus genotype data. Indeed, it may happen that some loci are not relevant for clustering into statistically different populations. Inferring the number K of clusters and the relevant clustering subset S of loci is regarded as a model selection problem. The competing models are compared using penalized maximum likelihood c...
متن کاملPenalized Model-Based Clustering with Application to Variable Selection
Variable selection in clustering analysis is both challenging and important. In the context of modelbased clustering analysis with a common diagonal covariance matrix, which is especially suitable for “high dimension, low sample size” settings, we propose a penalized likelihood approach with an L1 penalty function, automatically realizing variable selection via thresholding and delivering a spa...
متن کاملNon-parametric Machine Learning Methods for Clustering and Variable Selection
Qian Liu: Non-parametric machine learning methods for clustering and variable selection (Under the direction of Eric Bair) Non-parametric machine learning methods have been popular and widely used in many scientific research areas, especially when dealing with high-dimension low sample size (HDLSS) data. In particular, clustering and biclustering approaches can serve as exploratory analysis too...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
ژورنال
عنوان ژورنال: Statistics Surveys
سال: 2018
ISSN: 1935-7516
DOI: 10.1214/18-ss119